How Rings assign address space

Step 1: align incoming address to self (to some power of 2)
Step 2: assign the result to self address

Step 3: next_addr = self_addr + self_addr_space; // number of registers used locally
Step 4: send down next_addr



Example: 

Dim needs 16 addrs
Uart needs 4
Hirer needs 256

Enumerate address:
Addr=8
self=32
Addr=36
self=256
Addr=512
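The four steps can be sketched in Python as follows. This is only an illustration, assuming that "align" means rounding the incoming address up to the next multiple of the member's address space; the function names `align_up` and `enumerate_ring` are hypothetical. The member names and sizes are taken from the example as printed. The sketch also yields self=16 and Addr=32 for the first member, values that are computed here rather than read from the figure.

```python
def align_up(addr, size):
    """Round addr up to the next multiple of size (size must be a power of 2)."""
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a power of 2")
    return (addr + size - 1) & ~(size - 1)


def enumerate_ring(members, start_addr):
    """Assign a self address to each (name, addr_space) pair in ring order.

    Returns the list of (name, self_addr) and the address sent on by the
    last member.
    """
    assignments = []
    addr = start_addr
    for name, space in members:
        self_addr = align_up(addr, space)   # Steps 1 and 2
        addr = self_addr + space            # Step 3: next_addr
        assignments.append((name, self_addr))
        # Step 4: addr is what gets sent down to the next member
    return assignments, addr


members = [("Dim", 16), ("Uart", 4), ("Hirer", 256)]
assignments, next_addr = enumerate_ring(members, 8)
for name, self_addr in assignments:
    print(name, "self =", self_addr)
print("Addr =", next_addr)
```

Starting from Addr=8, the sketch gives self=32 for the Uart, Addr=36 after it, self=256 for the third member and Addr=512 at the end, in agreement with the figure.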
[Figure: two flipflops in series, Q of the first feeding d of the second, clocked by clk1 and clk2]



If the delay between "clk1" and "clk2" is greater than the delay from Q to d of the second flipflop, we have a race, meaning the right-hand flipflop will sample the data of Q a whole clock period early.
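The race condition stated above reduces to a single comparison. A minimal sketch follows, assuming both delays are given in the same unit (nanoseconds here); the function name and the numbers are hypothetical and only illustrate the rule.

```python
def has_race(clk_skew_ns, q_to_d_delay_ns):
    """True when clk2 arrives later than the new data from Q reaches d.

    In that case the second flipflop samples the new value of Q one
    clock period too early.
    """
    return clk_skew_ns > q_to_d_delay_ns


# Illustrative numbers only.
print(has_race(1.2, 0.8))  # clock skew exceeds the data delay
print(has_race(0.3, 0.8))  # data delay covers the clock skew
```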




Clock runs with data

The problem is a possible race. However, we control the logic on each flipflop leaving the compound. Because it is always the same standard ring-interface module, we can ensure that the delay will be at least enough and, more importantly, that it is easily checked after layout.
Clock opposes data

This arrangement has the advantage of automatically ensuring that no race condition exists (at least in this simple case).

[Figure labels: compound A, compound B, data_a, data_b]



[Waveforms: clkb, clka, Qa]



data_a, which changes after clkb, which is later than clka, is sampled by clka. NO RACE.



[Figure labels: clka, Qa, d, clkb, OK, compound B, compound A]

[Figure labels: latch, Qb, d, clkb, clka, data, clock]

[Waveforms: clka, clkb, data_a, with the uncertainty range marked]



data_a leaving the bridge goes to member "b" and should be sampled there by the rising edge of clkb. clkb lags a lot behind clka of the bridge. As clearly seen from the waveforms, a race is imminent. Here we should add latches for all the data lines (~90). Adding a latch works, however, only if the delay between clka and clkb is less than 75% of the cycle time; otherwise the uncertainty kills the usable time. This sets a hard limit on the number of ring members. Also keep in mind that latches are needed on each OK signal between members of the ring.
[Waveforms: data, clock, with the uncertainty range marked]



Here, data_b leaves member "b" to be sampled by clka in the bridge. But now clkb lags a lot behind clka. This actually works to our advantage, provided the lag is smaller than the better part of a clock cycle. This solution looks better, because between adjacent members we can take care to delay the data beyond the danger zone of clock delay, the OK signals are covered automatically, and the last-leg data is also covered. The only signal not safe is the OK from the bridge to the "b" member. It will need a latch in "b".



[Figure: big module with its ring interface. Labels: data_in, local_clock, local_data out, data from previous member, clk, data, clock]



The local clock lags behind the ring_interface clock of this module, because we presume the module is big. For data coming out, this is not a problem: it changes later than the ring-i/f flipflops' clock. However, for data entering the module from the previous member, a race is a possibility we must look into.
If module "a" sends a message to module "b", the ring works fine. However, if most of the traffic is from "c" to "b", this is more expensive in terms of latency.

Another problem is "peak latency". Suppose that "a" transmits mostly to "d" and "b" mostly to "c". In this case communication between "b" and "c" suffers degradation when the traffic peaks coincide.
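Both effects can be illustrated with a small Python sketch. It assumes a one-directional ring with members in the order a, b, c, d and one hop per member; the ring order and the helper names `hops` and `links` are assumptions made for this illustration, not taken from the figures.

```python
def hops(ring, src, dst):
    """Number of hops from src to dst travelling in ring order."""
    return (ring.index(dst) - ring.index(src)) % len(ring)


def links(ring, src, dst):
    """List of (from, to) ring segments a message from src to dst occupies."""
    i = ring.index(src)
    used = []
    for _ in range(hops(ring, src, dst)):
        j = (i + 1) % len(ring)
        used.append((ring[i], ring[j]))
        i = j
    return used


ring = ["a", "b", "c", "d"]
print(hops(ring, "a", "b"))   # neighbour in ring direction
print(hops(ring, "c", "b"))   # has to travel almost the whole ring

# Peak latency: traffic a->d and b->c compete for the same segment.
shared = set(links(ring, "a", "d")) & set(links(ring, "b", "c"))
print(shared)
```

Under these assumptions, "a" to "b" takes one hop while "c" to "b" takes three, and the "a" to "d" traffic occupies the same segment that "b" uses to reach "c".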
Land bridge gets its name from the fact that it is a luxury. It spans across connected modules. The idea is simple. When V2 sends a message to D1, it gets to one side of the bridge. This side analyzes the destination address and by some magic (explained later) decides to short-cut the path. The message re-appears at the other end of the bridge and gets to D1 fast. By the same magic, a message from V1 to D2 gets bypassed also; a message from V1 to D1 is treated directly.



Enumeration is started by the "Anchor", which assigns address=1 to itself. The results of enumeration are labels 1 to 7. The land bridge gets two addresses, as if it were not one module: there is the "near" end, which got enumeration label "3", and the "far" end, marked 6.
When msg1 and msg2 arrive at the same time, the bridge end must decide which message to forward first.

It can be shown that an unwise decision can lead to freeze-out, deadlock and option price dropping to 5$.

Therefore MSG2 gets the priority.
Bridge takes responsibility for strays, but only at the "far" end. During enumeration, the bridge is "polarized" to have a near and a far end. Near is the end first struck by the enumeration message.

So we have exactly one enforcer for each ring.



[Figure labels: 3 near, 11 far]



In a land bridge ring, the situation is trickier. If V2 sends a message to address=5, the land bridge diverts it at the 11/far end; it will re-appear at 3 and start cycling forever.

We have to define an algorithm that will take care of all cases.

Luckily there is a way.

Land Bridge deals only with messages arriving at the far end and being diverted. It marks and monitors only those. Messages arriving at the near end keep their markings. Messages at the far end going through are left alone.



APP ID=10064329 



Page 239 of 280







[Figure: ring interface signals addr, data (64), type, ok, scan test, clk]

[Waveforms: clk, type (idle / msgA), ok]



During the first clock, OK remains active while type is msgA. This means that on the next clock member A may send a new message. Member A uses this OK to send msgB on the next clock. msgB gets stuck for a clock because OK goes inactive. It goes inactive because the fifo in member B is full. One clock later, the fifo has a free entry, so OK returns to 1 and type returns to idle on the next clock. The return to idle could also be a change to the next message, if there was one.
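The OK handshake described above can be mimicked with a small cycle-by-cycle model. This is a simplified sketch, not the ring-interface logic itself: it assumes OK simply reflects "the receiving fifo has a free entry" at the start of each clock, and that the receiver frees entries on the clocks listed in `drain_clocks`. The function name, the fifo depth of 1 and the drain schedule are all hypothetical.

```python
from collections import deque


def simulate(messages, fifo_depth, drain_clocks, n_clocks):
    """Return a per-clock trace of (type on the wire, ok)."""
    pending = deque(messages)   # messages member A still has to send
    fifo = deque()              # input fifo of member B
    trace = []
    for clk in range(n_clocks):
        ok = len(fifo) < fifo_depth            # B has a free entry
        wire = pending[0] if pending else "idle"
        if wire != "idle" and ok:
            fifo.append(pending.popleft())     # message accepted this clock
        trace.append((wire, ok))
        if clk in drain_clocks and fifo:
            fifo.popleft()                     # B consumes one entry
    return trace


trace = simulate(["msgA", "msgB"], fifo_depth=1, drain_clocks={1, 2}, n_clocks=4)
for clk, (wire, ok) in enumerate(trace):
    print(clk, wire, "ok" if ok else "stalled")
```

In this run msgA is accepted on the first clock, msgB stays on the wire for one extra clock while OK is inactive, and the type returns to idle once msgB has been taken.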



[Figure labels: imsg, iok, omsg, ook, umsg, uok]






The incoming messages are examined first to see whether each is a supervisor or a work/program message. Work/program messages have an address field. We check if it is our address. Since we know that our address is aligned to our power of 2, the address mask (named split mask) causes only a certain number of upper bits to be compared. The lower part of the address is passed inside as the internal address. The upper bits are compared against the self-address register. This register gets its value during the enumeration protocol. The lower part of this register is always masked. Hopefully synthesis will delete the implementation of the unused bits.



[Figure: comparator. Inputs: incoming address, address split mask, upper part of self-address. Outputs: ours/through, and the part of the address that enters the member]
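The address check can be sketched as below. This is an illustration only, assuming the member's address space is 2**space_bits so that the split mask selects all bits above the lowest space_bits; the function name `decode` and its parameters are hypothetical.

```python
def decode(incoming_addr, self_addr, space_bits):
    """Split an incoming address into (ours, internal_address).

    ours is True when the upper bits match the self-address register;
    internal_address is the lower part passed into the member.
    """
    low_mask = (1 << space_bits) - 1   # lower bits: internal address
    split_mask = ~low_mask             # upper bits: compared
    ours = (incoming_addr & split_mask) == (self_addr & split_mask)
    return ours, incoming_addr & low_mask


# A member with self address 256 and an address space of 256 (8 bits).
print(decode(0x105, 0x100, 8))   # inside the member's range
print(decode(0x205, 0x100, 8))   # passes through to the next member
```

Because the self address is aligned to the member's power of 2, its lower bits are always zero, which is why only the upper bits need to be stored and compared.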
[Figure: Member with RIF. Labels: Rif_i_*, Rif_*, Rif_o_*, module_id, Address Space = 7, Activation register]
The second land bridge solves most traffic problems, but adds 4 clocks to the overall ring length. This is not a big problem because no message should travel the whole perimeter.

The utopia interface is forced into a mode that communicates in messages, not cells. We are using the I/O and maybe some of the logic.



Vobla Network Processor

Application Specific Accelerators: CRC, Encryption, Table Lookup, Hashing
Internal Memory: Fast, Unified, Multi-port
Peripheral Expansion: Enet, ATM, Uart, USB, Serials
System Expansion Area: CPU (PP), DMA, Smart FIFO, Ext. mem I/F
Vobla Core

Internal Memory
FTU
Load/Store: LSU
Program Sequencer: PSU
Register File: RFU
Arithmetic: DALU
Agent I/F: AGI
Agent Bus
Core Debug
Doorbell
DMA Agent

Vobla Compound
Resources: general purpose registers, agent interface, internal memory, special purpose registers, PP configuration registers, external memory.

Annotations:
32 registers of 32 bits per task
A set of indications per task, which control task execution scheduling
An interface to adjacent resources
Fast memory accessed by load/store instructions
Per task and [illegible] registers
Initialized by the PP
Big area accessed via a DMA interface
R1 register:

[Bit-field diagram, bits 31 to 0]

* - sticky bit
eq - equal/zero
lt - less than/negative
gt - greater than/positive
c - carry
mb - reflection of the RAM multi-reader busy indication

[Further bit-field diagrams, bits 31 to 0, including RLFETCH]
Frame structure of an example task type

common task data
task fragment 1 data
task fragment 2 data
task fragment 3 data
data of all level 2 functions
level 1 function data (three entries)

The size of the level 0 frame part is different for each task.
The size of the level 2 frame part is constant.
The size of the level 1 frame part is different per each task type.








[Legend: flip flop symbol; logic symbol]
[Figure: multireader with CRC. Blocks: Vobla, memory, memory interface, data packer & aligner, request entry, request fifo, input message decoder, output message encoder. Signals: snoop, last in frame, crc_stall, calculate crc, mem_stall, mem_address, mem_data, data_out, source_add[15:0], data out count, type_in, interface_address[15:0], interface_data[31:0], type_out[7:0], address_out[23:0], data_out[63:0], ok2drive, memory data]

agent command (AID=multireader)

Vobla entry in multireader:
option flags (op0 to op3): increment destination address, first, snoop, last
source addr in sram
address of destination
number of bytes to transfer
[Figure: message sender agent. Blocks: agent interface, Vobla, main entry, shadow entry, output message encoder. Signals: type_out[7:0], address_out[23:0], data_out[63:0], reset, clk, ok2drive]

agent command (AID=message_sender), option[6]=0: opcode, options, RA, AID[4:0]
message_sender entry: options[5:0], raw_data[31:0] (message data), raw_address[23:0] (address of destination), type[7:0] (message type)

agent command (AID=message_sender), option[6]=1: opcode, options, RA, AID[4:0], RB
message_sender entry: options[5:0], raw_data (message data), raw_address[23:0] (address of destination), 10000000 (message type)



[Figure: doorbell and DMA agent. Blocks: doorbell, Vobla, agent interface, token control, request entries x2, input message decoder, DMA context table, output message encoder. Signals: dm_set_mask0, dm_set_mask1, dm_set_mask2, dm_set_mask3, stall vobla, type_in[7:0], interface_address[23:0], interface_data[31:0], base_address[7:0], type_out[7:0], addr_out[23:0], data_out[63:0], ok2drive]
agent command (AID=dma agent): opcode, options[10:0], RA, RB/imm8

dma request entry: dram address, sram address[23:0], count[7:0]
option fields (OP0 to OP10 as marked in the figure): modify address, urgent, autosend, set ack address, dram address, sram subaddress
sram address
number of bytes to transfer or address modifier

[Figure: multireader and CRC agent. Blocks: Vobla, agent interface, register file, input buffer, tx_data_mux, random number generator (TRD), CRC 5,10,32 machines, checksum machine, bip16 machine. Signals: last in transfer, last in frame, calculate crc, multireader data[31:0], stall vobla, CRC data, on_demand, clk, reset]

agent command (AID=CRC agent): opcode, CRC type, data size, generate/check mode, operation
[Figure: doorbell. Blocks: agent interface, pre scaler, div, time stamp register, mask control, Vobla. Signals: reset, v_set_doorbell_mask, v_next_task_valid, v_current_task_id[5:0], v_current_task, rif_i_write, doorbell_cs, rif_i_addr[5:0], data[4:0], dm_set_mask0, dm_set_mask1, dm_set_mask2, dm_set_mask3, rif_i_reset]
agent command (AID=doorbell): opcode, options[9:0], RA, AID[4:0], RB/imm8

option fields: set/clear global (OP2), clear request (OP1), clear mask (OP0), and an index field (label partly illegible)

{0,0,0,0,0,mask_bit_index[2:0]} write mask
or
{0,0,0,0,1,req_bit_index[2:0]} write request
or
{1,0,0,0,0,count_value[2:0]} write counter
or
{0,1,0,0,0,0,0,0} write TGMR
or
{1,1,0,0,...,urgent_value} write urgent